Pushing Stochastic Gradient towards Second-Order Methods -- Backpropagation Learning with Transformations in Nonlinearities
Recently, we proposed to transform the outputs of each hidden neuron in a
multi-layer perceptron network to have zero output and zero slope on average,
and use separate shortcut connections to model the linear dependencies instead.
We continue the work, first by introducing a third transformation to normalize
the scale of the outputs of each hidden neuron, and second by analyzing the
connections to second-order optimization methods. We show that the
transformations make a simple stochastic gradient behave closer to second-order
optimization methods and thus speed up learning. This is shown both in theory
and with experiments. The experiments on the third transformation show that
while it further increases the speed of learning, it can also hurt performance
by converging to a worse local optimum, where both the inputs and outputs of
many hidden neurons are close to zero. Comment: 10 pages, 5 figures, ICLR201
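The three transformations described above can be illustrated with a minimal NumPy sketch. This assumes a tanh nonlinearity and a Gaussian sample of one hidden unit's pre-activations; the variable names and the proportional-scaling choice are illustrative, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)  # sample of a hidden unit's pre-activations

f = np.tanh
beta = f(x).mean()                 # 1st transformation: zero output on average
alpha = (1.0 - f(x) ** 2).mean()   # 2nd: zero slope on average (d/dx tanh = 1 - tanh^2)
g_unscaled = f(x) - alpha * x - beta
gamma = 1.0 / g_unscaled.std()     # 3rd: normalize the output scale

g = gamma * g_unscaled  # transformed output: roughly zero mean, zero mean slope, unit scale
```

The linear part `alpha * x + beta` that is subtracted here is what the separate shortcut connections would model instead.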
Deep Learning of Representations: Looking Forward
Deep learning research aims at discovering learning algorithms that discover
multiple levels of distributed representations, with higher levels representing
more abstract concepts. Although the study of deep learning has already led to
impressive theoretical results, learning algorithms and breakthrough
experiments, several challenges lie ahead. This paper proposes to examine some
of these challenges, centering on the questions of scaling deep learning
algorithms to much larger models and datasets, reducing optimization
difficulties due to ill-conditioning or local minima, designing more efficient
and powerful inference and sampling procedures, and learning to disentangle the
factors of variation underlying the observed data. It also proposes a few
forward-looking research directions aimed at overcoming these challenges.
Safe transient operation of microgrids based on master-slave configuration
The master-slave configuration is a suitable alternative
to the droop control method used in microgrids. In this configuration,
only one inverter is the master, while the others are slaves. The
slave inverters are always current controlled, whereas the master
inverter should have two selectable operation modes: current
controlled, when the microgrid is connected to the grid; and
voltage controlled, when it is operating in island mode. In
grid-connected mode, the master needs a synchronization system to
perform accurate control of its delivered power, and, in
island mode, it needs a voltage reference oscillator that serves
as a reference for the slave inverters. Based on the master-slave
concept, this paper proposes a single system that performs both
functions, i.e., it can act as a synchronization system or as
a voltage reference oscillator depending on an input selector.
Moreover, the system ensures a smooth transition between
the two operation modes, guaranteeing the safe operation of
the microgrid. Experimental results are provided to confirm the
effectiveness of the proposed system. Peer Reviewed
Scalable Neural Networks for Board Games
Learning to solve small instances of a problem should help in solving large instances. Unfortunately, most neural network architectures do not exhibit this form of scalability. Our Multi-Dimensional Recurrent LSTM Networks, however, show a high degree of scalability, as we empirically show in the domain of flexible-size board games. This allows them to be trained from scratch up to the level of human beginners, without using domain knowledge.
Towards Adjusting Mobile Devices to User’s Behaviour
Mobile devices are a special class of resource-constrained embedded devices. Computing power, memory, the available energy, and network bandwidth are often severely limited. These constrained resources require extensive optimization of a mobile system compared to larger systems. Any needless operation has to be avoided, and time-consuming operations have to be started early on. For instance, loading files ideally starts before the user wants to access the file. So-called prefetching strategies optimize a system's operation. Our goal is to adjust such strategies on the basis of logged system data. Optimization is then achieved by predicting an application's behavior based on facts learned from earlier runs on the same system. In this paper, we analyze system calls at the operating-system level and compare two paradigms, namely server-based and device-based learning. The results could be used to optimize the runtime behaviour of mobile devices.
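The learn-from-logged-calls idea can be sketched with a first-order (bigram) predictor: count which call tends to follow which in the log, then prefetch for the most likely successor of the current call. This is a minimal illustration, not the paper's method; the call names in the example log are hypothetical.

```python
from collections import Counter, defaultdict

class NextCallPredictor:
    """Bigram model over a logged system-call sequence: for each
    observed call, count its successors and predict the most
    frequent one. A prefetcher could act on predicted file
    accesses before the user requests them."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, log):
        for prev, nxt in zip(log, log[1:]):
            self.counts[prev][nxt] += 1

    def predict(self, current):
        if not self.counts[current]:
            return None  # call never seen as a predecessor
        return self.counts[current].most_common(1)[0][0]

# Illustrative log of two identical application runs.
log = ["open:config", "read:config", "open:db", "read:db",
       "open:config", "read:config", "open:db", "read:db"]
p = NextCallPredictor()
p.train(log)
print(p.predict("open:config"))  # → read:config
```

Training such a model on the device itself versus shipping logs to a server is exactly the device-based/server-based trade-off the abstract compares: the model is tiny either way, but log transfer costs bandwidth and energy.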